Exploratory Analysis: Descriptive Statistics and Clustering of MFPCA scores for Ecological, Climate and Soil Data


Descriptive Analysis

To get a first impression of the recovery trajectories for patches disturbed between 2015 and 2040, this section briefly describes the pre-processed data set. For each scenario, the disturbed grid cells are scattered across the boreal biome. The recovery trajectories comprise 434 disturbed grid cells for the Control scenario, 442 for SSP1-RCP2.6, 462 for SSP3-RCP7.0 and 465 for SSP5-RCP8.5, which adds up to 1803 disturbed patches in total. Since four independent runs of the vegetation model LPJ-GUESS are considered, the grid cells disturbed in each scenario may differ or coincide. In total, 1577 unique grid cells are disturbed, 226 of which are disturbed in at least two scenarios.
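The bookkeeping behind these counts can be sketched with basic set operations. The snippet below is only an illustration: the cell identifiers are invented, and the helper name `overlap_summary` is hypothetical rather than part of the original analysis.

```python
from collections import Counter


def overlap_summary(cells_by_scenario):
    """Count patches, unique grid cells, and cells disturbed in >= 2 scenarios."""
    # Each scenario contributes one patch per disturbed grid cell.
    total_patches = sum(len(cells) for cells in cells_by_scenario.values())
    # Count in how many scenarios each grid cell is disturbed.
    occurrences = Counter(
        cell for cells in cells_by_scenario.values() for cell in set(cells)
    )
    unique_cells = len(occurrences)
    shared_cells = sum(1 for n in occurrences.values() if n >= 2)
    return total_patches, unique_cells, shared_cells


# Toy example with invented (lon, lat) identifiers, not the real data.
toy = {
    "Control": [(10.25, 60.25), (11.25, 61.75)],
    "SSP1-RCP2.6": [(10.25, 60.25), (30.75, 65.25)],
    "SSP3-RCP7.0": [(42.25, 63.75)],
    "SSP5-RCP8.5": [(10.25, 60.25), (42.25, 63.75)],
}
total, unique, shared = overlap_summary(toy)
print(total, unique, shared)
```

For the toy input this prints `7 4 2`. Applied to the real data, the same logic would be expected to return 1803, 1577 and 226. Since 1803 - 1577 = 226, these figures are consistent with every shared grid cell being disturbed in exactly two scenarios.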


Soil Variables

To get an intuition for the soil in the boreal biome, Figure 1 and Figure 2 show boxplots of the soil composition and of further soil attributes, respectively. Locating the mean values and quantiles in the soil texture triangle reveals the soil to be mostly loam, sandy loam and loamy sand. The soil attributes show little variation: bulk density is on average close to \(1.3\) g/cm³, while roughly \(4.2\) g/kg of organic carbon is stored in the soil on average. The mean pH is \(5.6\), which indicates a slightly acidic environment.


Figure 1: Soil composition.

Figure 2: Soil attributes

Climate Variables

To gain insight into the climatic conditions at the disturbed grid cells, Figure 3 shows the annual mean, minimum and maximum temperature as well as the summed precipitation, averaged over all disturbed grid cells, for each scenario between 2015 and 2140. Note that the temperatures are converted to °C for easier interpretation. All three temperature curves reveal the same pattern: the three warming scenarios behave similarly from 2015 to 2050, with all of them facing an increase in average temperature (upper left). Thereafter, the mean temperature, as well as the minimum and maximum temperatures, decrease for the SSP1-RCP2.6 scenario, while they continue to increase for the SSP3-RCP7.0 and SSP5-RCP8.5 scenarios until 2100. A similar pattern is present in annual precipitation. While the behaviour is similar for the two most extreme scenarios, the SSP1-RCP2.6 scenario experiences high precipitation at the beginning of the study period, which then decreases over time. Note that for both temperature and precipitation the Control shows no trend.


Figure 3: Yearly mean, minimum and maximum temperature and precipitation averaged over all disturbed grid cells for each scenario.

Ecological Variables

Nitrogen uptake is an indicator of vegetation growth. Figure 4 shows the nitrogen uptake per PFT and scenario (left) and the total uptake per scenario (right), both averaged over all disturbed grid cells. The left panel suggests that there is no trend in uptake in the Control scenario, with the highest values for the PFTs Tundra and Needleleaf evergreen. In contrast, the warming scenarios show nitrogen dynamics that become more pronounced with increasing radiative forcing. While Tundra and Needleleaf evergreen still dominate nitrogen uptake in the early decades of the study period, Pioneering broadleaf takes the lead in all three scenarios afterwards. In addition, the uptake for Temperate broadleaf increases in the last decades of the period for the two extreme scenarios SSP3-RCP7.0 and SSP5-RCP8.5. Note that the nitrogen uptake of the PFT Conifers (other) remains almost constant in all four scenarios. Interestingly, the general pattern of the total nitrogen uptake (right) is very similar to that of the temperature curves in Figure 3. While the Control remains at a constant level throughout the study period, nitrogen uptake in the three warming scenarios increases in the early decades. In the SSP1-RCP2.6 scenario it then decreases before levelling off, whereas the values for the two more extreme scenarios continue to increase until 2100 and then remain constant. This increase in total nitrogen uptake is also reflected in the left panel, where the values summed over PFTs rise as well.


Figure 4: Nitrogen uptake per PFT and scenario averaged over all disturbed grid cells (left) and total nitrogen uptake of the grid cell for each scenario averaged over all disturbed grid cells (right).

The three remaining ecological variables are described in the following. The number of new seedlings per PFT immediately after the disturbance, i.e., the initial recruitment realised as a Poisson process, is visualized as boxplots in Figure 5. It reveals that Needleleaf evergreen and Tundra are dominant, which remains true ten years after the disturbance (Figure 6). While the expected number of seedlings of the PFTs Pioneering broadleaf and Conifers (other) is already of minor importance directly after the disturbance, their new seedlings almost disappear after ten years. It is noteworthy that there are only minor differences between the three warming scenarios, while the Control seems to reach a lower number of new seedlings in general. The vegetation composition before the disturbance, transformed into the relative proportions shown in Figure 7, highlights the dominance of Needleleaf evergreen. In all four scenarios, its share is the largest of all PFTs, but it also shows the most variation. Again, there are no major differences between the scenarios. Note that even in the most extreme scenario, SSP5-RCP8.5, the PFT Temperate broadleaf does not play a role in the composition.
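To make the Poisson recruitment step more tangible, the following sketch draws seedling counts per grid cell and PFT and summarises them by the quartiles a boxplot would display. It is not the LPJ-GUESS implementation: the expected values in `expected` are invented, and only the number of Control grid cells (434) is taken from the text.

```python
import numpy as np

# Hypothetical expected seedling numbers per PFT (invented for illustration).
expected = {
    "Needleleaf evergreen": 12.0,
    "Tundra": 10.0,
    "Pioneering broadleaf": 2.0,
    "Conifers (other)": 1.0,
    "Temperate broadleaf": 0.5,
}

rng = np.random.default_rng(seed=1)
n_cells = 434  # number of disturbed grid cells in the Control

# One Poisson draw per grid cell and PFT: shape (n_cells, n_pfts).
seedlings = rng.poisson(lam=list(expected.values()), size=(n_cells, len(expected)))

# Quartiles per PFT, i.e. the numbers a boxplot would display.
quartiles = np.percentile(seedlings, [25, 50, 75], axis=0)
for pft, (q1, med, q3) in zip(expected, quartiles.T):
    print(f"{pft}: Q1={q1:.0f}, median={med:.0f}, Q3={q3:.0f}")
```

Because a Poisson variable has variance equal to its mean, PFTs with larger expected recruitment also show wider boxes in such a simulation.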


Figure 5: Number of new seedlings immediately after the disturbance.

Figure 6: Number of new seedlings ten years after the disturbance (summed up).

Figure 7: Vegetation composition prior to the disturbance.

Recovery

To get a first impression of the recovery curves, Figure 8 shows the recovery trajectories for all four scenarios. The bold curve corresponds to the PFT-wise mean. Note that these curves do not result from a basis representation, but from the interpolated individual data points. The PFT composition clearly follows similar patterns. In all scenarios, Tundra dominates in the majority of grid cells in the first years after the disturbance. After a short peak, its share decreases and other PFTs take over. The dominant vegetation after 100 years, in particular, differs between the scenarios. The stronger the increase in radiative forcing, the more dominant Pioneering broadleaf becomes, while the importance of Needleleaf evergreen decreases. The PFT Temperate broadleaf becomes more present in the more extreme scenarios, but is not able to displace needleleaf trees and Pioneering broadleaf. In addition, Figure 9 shows the mean differences between the warming scenarios and the Control, i.e., the difference between the average relative carbon values of each warming scenario and the corresponding Control values. After a few decades of recovery, the figure reveals a rather uniform behavior: more Pioneering broadleaf and Temperate broadleaf than in the Control on the one hand, and a smaller proportion of the remaining PFTs on the other.
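The difference to the Control described here can be sketched for a single time point as follows. The carbon values are invented, only three PFTs are used for brevity, and the helper names `relative_shares` and `difference_to_control` are hypothetical.

```python
def relative_shares(carbon_by_pft):
    """Convert absolute carbon per PFT into shares that sum to 1."""
    total = sum(carbon_by_pft.values())
    return {pft: value / total for pft, value in carbon_by_pft.items()}


def difference_to_control(mean_shares, control="Control"):
    """Subtract the Control shares from each warming scenario, PFT by PFT."""
    reference = mean_shares[control]
    return {
        scenario: {pft: shares[pft] - reference[pft] for pft in reference}
        for scenario, shares in mean_shares.items()
        if scenario != control
    }


# Invented carbon values (arbitrary units) for one time point.
toy_carbon = {
    "Control": {
        "Needleleaf evergreen": 6.0,
        "Pioneering broadleaf": 2.0,
        "Tundra": 2.0,
    },
    "SSP5-RCP8.5": {
        "Needleleaf evergreen": 4.0,
        "Pioneering broadleaf": 5.0,
        "Tundra": 1.0,
    },
}
mean_shares = {s: relative_shares(c) for s, c in toy_carbon.items()}
diffs = difference_to_control(mean_shares)
for pft, diff in diffs["SSP5-RCP8.5"].items():
    print(f"{pft}: {diff:+.2f}")
```

Since the shares sum to one in every scenario, the differences sum to zero across PFTs: a higher share of one PFT relative to the Control is necessarily offset by lower shares elsewhere.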


Figure 8: Recovery trajectories for all PFTs and scenarios for patches disturbed between 2015 and 2040.

Figure 9: Differences in vegetation composition relative to scenario Control. Values above zero correspond to a higher share of the specific PFT than in the Control and vice versa.

Clustering

Recall that the 1806x100x5 data set was reduced in dimension by an MFPCA and that a 4-means algorithm was applied to the first 10 PC scores. The four resulting clusters are now analyzed further in terms of the variables introduced above. The following table provides an overview of the number of grid cells in each cluster:


Number of curves in each cluster after 100 years of recovery
             Control   SSP1-RCP2.6   SSP3-RCP7.0   SSP5-RCP8.5    Sum
Cluster 1         47           177           204           224    652
Cluster 2         83            50            55            49    237
Cluster 3         68            46            58            65    237
Cluster 4        236           169           145           127    677
Sum              434           442           462           465   1803
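As a quick consistency check, the margins of the table above can be recomputed from its cells. The snippet also derives, per scenario, the share of curves falling into each cluster. That is an additional view not shown in the original table.

```python
scenarios = ["Control", "SSP1-RCP2.6", "SSP3-RCP7.0", "SSP5-RCP8.5"]
counts = {
    "Cluster 1": [47, 177, 204, 224],
    "Cluster 2": [83, 50, 55, 49],
    "Cluster 3": [68, 46, 58, 65],
    "Cluster 4": [236, 169, 145, 127],
}

# Row margins: number of curves per cluster.
row_sums = {cluster: sum(row) for cluster, row in counts.items()}
# Column margins: number of curves per scenario.
col_sums = [sum(col) for col in zip(*counts.values())]

# Share of each scenario's curves that falls into each cluster.
col_shares = {
    cluster: [round(n / total, 2) for n, total in zip(row, col_sums)]
    for cluster, row in counts.items()
}

print(row_sums)
print(dict(zip(scenarios, col_sums)))
for cluster, shares in col_shares.items():
    print(cluster, shares)
```

The recomputed margins agree with the Sum row and column. The shares indicate that cluster 1 holds roughly 11% of the Control curves but roughly 48% of the SSP5-RCP8.5 curves, while cluster 4 drops from roughly 54% to roughly 27%.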

As a reminder, Figure 10 shows the PFT-wise average shares of above ground carbon.


Figure 10: PFT-wise mean shares of above ground carbon over time. Note that the values are averages over locations and clusters.
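The pipeline recalled at the start of this section (dimension reduction followed by 4-means on the first 10 scores) can be sketched roughly as follows. This is not the MFPCA used in the analysis: as a simplified stand-in, ordinary PCA is applied to the curves of all PFTs concatenated on a common grid, and k-means is a plain Lloyd's algorithm with random initialisation. The data are random, the array is assumed to be ordered as curves x time points x PFTs, and the function names are hypothetical.

```python
import numpy as np


def pca_scores(curves, n_components):
    """Project flattened curves onto their leading principal components."""
    n_curves = curves.shape[0]
    flat = curves.reshape(n_curves, -1)  # (n_curves, n_time * n_pft)
    centered = flat - flat.mean(axis=0)
    # SVD of the centred data: rows of vt are the principal directions.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def kmeans(scores, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per row."""
    rng = np.random.default_rng(seed)
    centers = scores[rng.choice(len(scores), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign every curve to its nearest centre (squared Euclidean distance).
        dists = ((scores[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centres; keep the old centre if a cluster runs empty.
        new_centers = np.array([
            scores[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


# Random stand-in data, much smaller than the 1806 x 100 x 5 data set.
rng = np.random.default_rng(42)
toy_curves = rng.random((60, 20, 5))
scores = pca_scores(toy_curves, n_components=10)
labels = kmeans(scores, k=4)
print(scores.shape, np.bincount(labels, minlength=4))
```

A proper MFPCA would typically first expand each PFT's curves in a univariate basis and may weight the elements differently, so the scores from this sketch are only a rough analogue. Cluster numbering from k-means is arbitrary and depends on the initialisation.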

Soil Variables

Figure 11 shows the soil composition and soil attributes for each cluster and all four scenarios. While there are only minor differences between the clusters in terms of silt, clusters 3 and 4 tend to comprise less clay and more sand than clusters 1 and 2. Interestingly, according to Figure 10, clusters 1 and 2 are both dominated by Pioneering broadleaf, which may explain the similar behavior. The soil attributes (second row) show no substantial differences in terms of average value. Cluster 1 seems to capture more variation in the data for scenario SSP1-RCP2.6.


Figure 11: Soil properties for each cluster and all four scenarios.

Climate Variables

To investigate whether the clusters cover different temperature and/or precipitation values, Figure 12 shows the yearly mean, minimum and maximum temperature and the summed precipitation per cluster, averaged over all grid cells. Here, the curves are independent of the scenario, i.e., the average is taken over all grid cells in the specific cluster regardless of the scenario.

Cluster 1 clearly covers grid cells with the highest mean and minimum temperature and the most precipitation throughout most of the study period. The differences between the other three clusters are rather small. For the maximum temperature, however, clusters 1 and 3 behave similarly, while clusters 2 and 4 tend to have a lower annual maximum temperature.


Figure 12: Yearly mean, minimum and maximum temperature and precipitation per cluster averaged over all grid cells (independent of the scenario).

So far, the grid cells were considered regardless of the scenario in which they were disturbed. Figures 13, 14, 15 and 16 take a look at the scenario-specific averages of mean, minimum and maximum temperature as well as precipitation.
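The two levels of aggregation, pooled per cluster versus per cluster and scenario, can be sketched with a small grouping helper. The records and the name `group_mean` are invented for illustration.

```python
from collections import defaultdict
from statistics import mean


def group_mean(records, keys):
    """Average the 'value' field of records grouped by the given key fields."""
    groups = defaultdict(list)
    for record in records:
        groups[tuple(record[k] for k in keys)].append(record["value"])
    return {group: mean(values) for group, values in groups.items()}


# Invented temperature-like values for a handful of grid cells.
toy = [
    {"cluster": 1, "scenario": "Control", "value": 2.0},
    {"cluster": 1, "scenario": "SSP5-RCP8.5", "value": 6.0},
    {"cluster": 1, "scenario": "SSP5-RCP8.5", "value": 4.0},
    {"cluster": 3, "scenario": "Control", "value": 1.0},
    {"cluster": 3, "scenario": "SSP5-RCP8.5", "value": 3.0},
]

pooled = group_mean(toy, ["cluster"])
by_scenario = group_mean(toy, ["cluster", "scenario"])
print(pooled)
print(by_scenario)
```

A pooled average weights each scenario by its number of curves in the cluster. A cluster that consists mostly of warming-scenario curves may therefore appear warmer in the pooled view partly because of its scenario mix, which is one reason to also inspect the scenario-specific averages.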

For the yearly mean temperature (Figure 13), only minor differences between the clusters are detectable. In the Control scenario, cluster 1 seems to have the highest annual mean temperature in the second half of the study period. In scenario SSP1-RCP2.6, cluster 3 dominates in terms of temperature. For the remaining two scenarios there are no clear differences between the clusters.


Figure 13: Yearly mean temperature per cluster and scenario averaged over scenario-specific grid cells.

The annual minimum temperature depicted in Figure 14 reveals patterns similar to those in Figure 13, but they tend to be more pronounced. Now, cluster 3 comprises the grid cells with the lowest minimum temperature for both SSP3-RCP7.0 and SSP5-RCP8.5.


Figure 14: Yearly minimum temperature per cluster and scenario averaged over scenario-specific grid cells.

Figure 15 shows the annual maximum temperature per cluster and scenario, averaged over the disturbed grid cells. Interestingly, while cluster 3 is the coldest cluster in SSP3-RCP7.0 in terms of minimum temperature, it is also the warmest cluster in terms of maximum temperature in that very scenario. This implies a large temperature range within the year. For the remaining scenarios, only minor differences between the clusters become apparent.


Figure 15: Yearly maximum temperature per cluster and scenario averaged over scenario-specific grid cells.

The annual precipitation in Figure 16 again highlights the special role of cluster 3: while in SSP1-RCP2.6 it is the cluster with the most precipitation, in both SSP3-RCP7.0 and SSP5-RCP8.5 it is the cluster with the least. Overall, the curves are again very similar among the clusters.


Figure 16: Yearly precipitation (summed up) per cluster and scenario averaged over scenario-specific grid cells.

Overall, none of the four climate variables seems to be a major driver in the delineation of the clusters.


Ecological Variables

First, let's take a look at nitrogen uptake. High values are indicative of biological productivity and good soil conditions, and may imply growth and adaptation of individual PFTs. Here, we look at two different perspectives: Figure 17 shows the PFT-wise nitrogen uptake, averaged over PFTs, grid cells and scenarios. In contrast, Figure 18 depicts the total nitrogen uptake averaged over grid cells and scenarios.


Figure 17: Nitrogen uptake averaged over all disturbed grid cells.

The results are surprising: while cluster 1 is the group with the largest uptake in both perspectives, there is a clear difference for the other three clusters. Cluster 4, the largest cluster, has the second largest uptake considering PFT-wise nitrogen, while for the total nitrogen uptake, cluster 2 is second.


Figure 18: Total nitrogen uptake averaged over all disturbed grid cells.

To further explore these differences, Figure 19 shows the nitrogen uptake per PFT and scenario, again averaged over grid cells. Interestingly, in the Control, mainly Tundra contributes to the nitrogen uptake. Cluster 1 seems to cover smaller uptakes than the other clusters. For the warming scenarios, Pioneering broadleaf is especially crucial. Here, a clear separation between the clusters is only apparent in scenario SSP5-RCP8.5: cluster 3 comprises grid cells with lower uptake of Pioneering broadleaf, but higher uptake of Temperate broadleaf. This corresponds to Figure 10, where the high mean shares of aboveground carbon of cluster 3 are already indicated.


Figure 19: Nitrogen uptake per PFT and scenario averaged over all disturbed grid cells.

The total nitrogen uptake per scenario, averaged over the disturbed grid cells (Figure 20), does not reveal any major differences between the clusters. Overall, some cluster-specific trends are present in the PFT-wise nitrogen uptake, but the differences between the clusters are rather small.


Figure 20: Total nitrogen uptake per scenario averaged over all disturbed grid cells.

Next, the number of new seedlings immediately after the disturbance is examined. As Figure 21 indicates, there is no major difference between the clusters in terms of the total number of new seedlings:


Figure 21: Number of new seedlings per cluster right after disturbance averaged over all disturbed grid cells.

Figure 22 shows the number of new seedlings per PFT right after the disturbance, again averaged over all disturbed grid cells. No clear pattern is detectable for the different clusters. For the PFT Needleleaf evergreen, clusters 1 and 2 seem to comprise grid cells with a smaller number of new seedlings than clusters 3 and 4 for the first three scenarios. Interestingly, the number of new seedlings for Pioneering broadleaf and Temperate broadleaf is close to zero for the Control. Note that for these PFTs, the variation differs substantially between the clusters.


Figure 22: Number of new seedlings per PFT and cluster right after disturbance averaged over all disturbed grid cells.

The number of new seedlings summed up over the first ten years after the disturbance (Figure 23) shows that cluster 3 tends to comprise more seedlings than the other three clusters. The clear separation of dominant vegetation that is visible in Figure 10 is not quite reflected in the PFT-wise number of new seedlings in Figure 24. Again, the differences between the clusters are small and do not follow any specific pattern.


Figure 23: Number of new seedlings per cluster 10 years after disturbance (summed up) averaged over all disturbed grid cells.

Figure 24: Number of new seedlings per PFT and cluster 10 years after disturbance (summed up) averaged over all disturbed grid cells.

Lastly, let's take a look at the vegetation composition before the disturbances occurred. Figure 25 shows the aboveground carbon prior to disturbance for all four clusters. No major differences become apparent; the carbon seems rather balanced between the clusters.


Figure 25: Vegetation composition before the disturbance per cluster.

To investigate which PFT was dominant before the disturbance happened, Figure 26 shows the PFT-wise vegetation composition. The share of Pioneering broadleaf seems to be quite variable, especially in contrast to the other PFTs. Again, the differences between the clusters are rather small with no clear pattern.


Figure 26: Vegetation composition before the disturbance per PFT and cluster.